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Abstract — Multipath interference is an ubiquitous phenomenon 
in modern communication systems. The conventional way to 
compensate for this effect is to equalize the channel by estimating 
its impulse response by transmitting a set of training symbols. 
The primary drawback to this type of approach is that it 
can be unreliable if the channel is changing rapidly. In this 
paper, we show that randomly encoding the signal can protect it 
against channel uncertainty when the channel is sparse. Before 
transmission, the signal is mapped into a slightly longer codeword 
using a random matrix. From the received signal, we are able to 
simultaneously estimate the channel and recover the transmitted 
signal. We discuss two schemes for the recovery. Both of them 
exploit the sparsity of the underlying channel. We show that if the 
channel impulse response is sufficiently sparse, the transmitted 
signal can be recovered reliably. 

I. Introduction 

In many communication systems, transmitted signals suffer 
from multipath interference [1]. These effects can be mitigated 
on the receiver end by estimating the channel response, either 
by using training (or pilot) symbols [2] or by blind equalization 
[3]. In a number of applications, the channel exhibits a sparse 
impulse response, i.e. a small nonzero support. In such cases, 
the sparse nature of the channel can be exploited to get 
better estimates [4]. In recent years several training based 
methods have been presented in this regard [4]-[8]. However, 
if the channel response is unknown, partially known, and/or 
changing rapidly, some other mechanism is needed to protect 
the transmitted signals from the channel interference. 

In this paper we demonstrate how a random coding scheme 
can protect an arbitrary signal from the effects of sparse 
channels. We show that if the original signal is encoded using 
a random matrix before transmission, then we can estimate the 
channel and recover the signal exactly (when there is no noise 
present) on the receiver side. Our scheme does not use any 
prior knowledge of the channel, other than that the impulse 
response of the channel is sparse. 

The problem can be formulated as follows. We want to 
transmit a signal x € R™ to a remote receiver. In order to 
protect the signal from the multipath interference, we instead 
transmit a codeword Ax, where A is an m x n random coding 
matrix with m > n. The received signal can be expressed as 



y = Ax ® h, 



(1) 



where h £ M. m is a fc-sparse vector (i.e., contains only k 
nonzero entries) for the impulse response of the channel and 



<g> denotes the circular convolution. Our goal is to recover the 
original signal x from the received signal y. 

Few comments about the problem formulation are in order. 
The coding matrix A can be generated by drawing its entries 
independently from a subgaussian distribution, e.g., Gaussian 
or Bernoulli [9]. However, in this paper, we assume that the 
coding matrix A is Gaussian, whose entries are independently 
drawn from the normal distribution N(0, 1/m). Although we 
use circular convolution in (1), the setup and subsequent 
discussion can be easily extended to the linear convolution. 
With our assumption that h is sparse, for the recovery we can 
use methods from sparse signal approximation literature [10], 
in particular, compressive sensing [11], [12]. 

In general, compressive sensing deals with the problems 
involving the recovery of sparse signals from small number of 
linear measurements. Consider the following setup. Assume 
that we want to recover a sparse signal a e R™ from its 
linear measurements b = <f>a, where $ is an m x n matrix 
with m <C n. Ideally, we would like to solve the following 
optimization problem 

minimize ||S||o subject to b = <I>a, (2) 

where ||a|| denotes the £ norm (i.e., cardinality) of the 
optimization variable a. Unfortunately, the problem in (2) 
is combinatorial in nature, and known to be NP-hard [13]. 
However, a relaxation of £q norm with £\ norm makes the 
optimization problem convex, and can be solved efficiently 
[14]. The resultant £\ optimization problem (also known as 
basis pursuit) can be written as 

minimize 1 1 ad! subject to b = <M. (3) 

If the original signal a is fc-sparse and <f> obeys some 
incoherence conditions (e.g., $ is Gaussian or Bernoulli), 
then the solution of (3) is exact (with high probability) if 
m= 0{k\og(n/k)) [15]. 

A related problem in compressive sensing, where random 
coding is used, is the £i decoding - an error correction scheme 
[16]. In this framework we use random coding to protect an 
arbitrary signal from the "sparse additive" errors. The problem 
setup is as follows. Assume that we want to reliably transmit 
a signal x e R™ to the receiver. In order to do so, we transmit 
a codeword Ax, where A is an m x n random coding matrix 
with m > n. The received signal is 



y = Ax + e, 



(4) 



where e e M. m is the sparse error introduced by the channel. 
If e contains k nonzero terms and m — n = 0(klog(m/k)), 
then solving the following optimization program will recover 
x exactly (with very high probability) [16], [17] 

minimize \\Ax — y||i- (5) 

In this paper we use random coding to protect an arbitrary 
signal from the effects of a "sparse convolutive" channel. We 
present two schemes for the signal recovery. The first scheme 
formulates the recovery problem as a block sparse signal 
recovery problem. The second scheme is an improvement over 
block sparse method and it uses an alternating minimization 
algorithm to recover x and h. We also present some simulation 
results to demonstrate the performance of random coding. 

II. Block 4 

In this section we discuss the block-sparse formulation to 
estimate x and h in (1). The system in (1) can be written as 

m 

y = Y,H^)S l - 1 Ax, (6) 
»=i 

where S z denotes an operator for i circular downward shifts 
of the columns, h(i) is the ith element of h. By noting that 
h(i)x is simply the ith column of the rank-1 matrix U = 
xh T e R nxm , we may rewrite this as: 

y = AU, (7) 

where the U is produced from the vectorization operation 
U = [f/(l, 1), {7(2, 1), U(n, m)] T , and A is an m x run 
matrix with n circular shifted copies of A stacked together: 
A = [A S 1 A S 2 A ■■■ S^A]. This vector U is 
called block-sparse because the nonzero elements occur in 
particular contiguous regions (or blocks) of n elements inside 
this vector corresponding to the nonzero columns {7, = h(i)x 
(V i : h(i) ^ 0). 

Knowing that U is fcn-sparse, we could solve for U using 
the standard t\ minimization (3). However, the block-sparse 
form lends itself to much more sophisticated approaches that 
are possible when the sparsity has a structured model [18]. In 
block-sparse signals, the nonzero entries are clustered together 
[19]. This way, if h is fc-sparse, we may consider U as 
a fc-block-sparse signal and solve the following block l\ 
optimization program to estimate U from y 

m 

minimize ^2\\U t \\ 2 subject to y = AU. (8) 

»=i 

This is a convex program and can be solved as a second order 
cone program or semi definite program [14]. The performance 
guarantees for the block-sparse recovery have been presented 
in [19]-[21]. 

It follows from recent results in compressive sensing the- 
ory that with a block Gaussian circulant matrix A as in 
(7), the solution of (8) is exact with high probability if 
m = O (kn log 5 (mn)) [22], [23]. This means that in order 



to recover x exactly at the receiver side, we need n ele- 
ments in codeword for every single element in the channel 
impulse response. This multiplicative relation between the 
lengths of x and codeword Ax makes the block l\ scheme 
unattractive. However, this bound on the required length of 
the codeword is not optimal. Since there are only n + k 
unknown elements in (1), we would like to recover x and 
h with m = 0((n + k) log(m)). 

Although the block l\ approach improves upon naive l\ 
minimization, there are still some fundamental limitations of 
the block-sparse recovery method that hinder its performance. 
In particular, this formulation lacks the "rank one" constraint 
on the desired block-sparse solution U. Ideally we would like 
to solve the following optimization problem 

ra 

minimize \\Uj\\2 

i=l 

subject to y = AU 

rank(C/) = 1. 

Indeed, this rank constraint would critically reduce the number 
of degrees of freedom from 0(nk) to 0(n + k), enabling 
solutions with substantially fewer measurements. However, 
this constraint is a non-convex problem and solving such a 
program is known to be NP-hard [24]. This rank constraint 
can be relaxed (e.g. with a nuclear norm constraint), and we 
discuss such approaches in another paper [25]. 

III. Alternating minimization 

In this section we discuss an alternating minimization 
algorithm to recover x and h from the measurements y 
in (1). Before getting into the discussion about alternating 
minimization, consider the following two simpler problems. 
Suppose we already know the channel response h and we want 
to estimate x from y. In this case we can write the system in 
(1) as 

y = Hx, (9) 

where H = h <g> A = Yh=i h{i)S^ x A is an m x n 
matrix whose columns are circular convolution of h with the 
columns of A. Assuming that H has full column rank, x can 
be computed exactly by solving the following least squares 
problem 

minimize \\Hx — y\\\. (10) 

We can write H = FT,tF*A, where F* denotes a DFT matrix 
and is a diagonal matrix consisting of h = F*h. In order 
to recover x exactly (using the perfect knowledge of h), the 
rank of matrix H should be at least n. This implies that h 
must have at least n non-zero entries. The discrete uncertainty 
principle [26] tells us that for any m-dimensional signal h 
and its Fourier transform h, the following relation holds for 
the time and frequency support sizes 

INo + IHIo >2^. 



The quantitative robust uncertainty principle [27] improves on 
this bound and suggests that, for almost every h, the following 
relation holds for the time and frequency support sizes 
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Therefore, even when we know the channel response h exactly, 
to reliably recover x in (9) we need m > (n + k) log to. 

On the other hand, suppose we know x and we want to 
estimate h from y. The system in (1) can be written as 



y=Ch, 



(11) 



where C is the circulant matrix created from the vector Ax. 
The sparse vector h can be estimated using any standard 
sparse recovery method. Since A is Gaussian matrix and x 
is a deterministic signal, Ax is a Gaussian vector and C is 
a Gaussian circulant matrix. Therefore, (with the complete 
knowledge of x) we can recover h exactly by solving the 
basis pursuit problem if to > fclog 5 (m.) [28]. 

Now consider the original problem, where we do not know 
x and h, and the only prior information we have is that h is 
sparse. In this regard, we wish to solve the following joint 
optimization problem for x and h 



minimize ^ 1 1 h ® Ax — y\\\ + r||ft||i, 
x,h 2 



(12) 



where r > is a regularization parameter. There are two 
costs involved in (12). The quadratic term in (12), known as 
the residual, tries to keep estimates close to the measurements. 
The l\ norm works towards making the estimate of channel 
sparse. Note that for a fixed value of h, (12) reduces to the least 
squares problem (10). On the other hand, for a fixed value of x 
(12) becomes convex in h, and takes the form of a well studied 
sparse recovery problem - basis pursuit denoising (BPDN) 
[10]. However, the optimization problem (12) is not convex 
in both x and h simultaneously, which makes it a challenging 
task to solve this problem. We use the alternating minimization 
scheme to solve (12) for x and h. 

The alternating minimization (also known as projection 
method) is a commonly used method to solve an optimiza- 
tion problem over two variables [29]. In the framework of 
alternating minimization, instead of solving the optimization 
problem over multiple variables simultaneously, we solve a 
sequence of optimization problems over one variable at a time. 
This scheme of "alternating the minimization variables" has 
been used in a variety of applications. Examples include blur 
identification and image restoration, blind source separation, 
adaptive filter design and matrix factorization. 

The alternating minimization (AM) algorithm for (12) is an 
iterative scheme. Every AM iteration can be divided into two 
main steps, where we alternately minimize (12) over h and x. 
We use subscript j to denote the variables at jth iteration of 



Algorithm 1 Alternating minimization for channel protection 
Start with the initial solutions xo and h for (12) at j = 0. 
repeat 

j <- 3 + 1 

Select the value of tj (or the cardinality kj for the channel 
estimate). 

Fix x = Xj-i in (12) and solve it as (13) to compute hj. 
Fix h = hj in (12) and solve it as (14) to compute xj. 

Normalize Xj as Xj — T . — ^— . 

INI2 

until stopping criterion is satisfied 



the algorithm. Assume that Xj-i and hj-i are the solutions 
at the beginning of jth iteration of AM. The two main steps 
of the AM iteration can be described as follows. 

1) With the signal x fixed as Xj-\, update the channel 
estimate hj by solving (12) over h as 

minimize ^||Cj/i — y\\\ + Tj\\h\\i, (13) 

where Tj is the regularization parameter and Cj is the 
circulant matrix created from the vector Axj-i. 

2) With channel h fixed as hj, update the signal estimate Xj 
by solving (12) over x as 



minimize \\Hjx — y\\ 2 , 
where Hj = hj®A = J2T=i hj^S^A. 



(14) 



Repeat the procedure of updating hj and Xj until some 
convergence criterion is satisfied. 

A sketch for the AM algorithm is given in Algorithm 1 . Here 
we discuss some important features of the AM algorithm in 
detail. 

Nonconvexity: The optimization problem (12) is nonconvex, 
and it allows several possible solutions (i.e., the local minima). 
The final solution where Algorithm 1 converges depends on 
several factors. The most important factors in this regard are 
the choices for the initial values of x and h and the value 
of regularization parameter Tj at every iteration. There can be 
many ways to initialize the Algorithm 1 and then control the 
regularization parameter Tj, so that the Algorithm 1 converges 
to the global minimum for (12) (which is often at the correct 
solution). In the following discussion, we restrict ourselves to 
one such setup, where we start with an initial channel estimate 
which has only one nonzero entry and slowly add more entries 
in the channel estimate as we move towards the final solution. 
Initialization: We start Algorithm 1 by identifying the 
strongest path (component) in the channel response. The initial 
channel estimate ho is selected as a vector which is one 
at the location corresponding to the strongest path and zero 
everywhere else. This way the initial estimate of the channel 
becomes just a delay function. The initial value for xo can 
then be computed as the least squares estimate, using ho in 
(14). The initial value of channel estimate h can either be 
known as a prior knowledge (e.g., the line of sight path is 
dominant) or it can be estimated from the received signal y 



as follows. We can identify the strongest path in the channel, 
and consequently xq and ho, by solving the following least 
squares problem for every possible delay i as 



minimize 



S^Ax-yWi for i e {l,...,m}. (15) 



The delay which gives the smallest residual in (15) corre- 
sponds to the strongest path in the channel. This ultimately 
gives us the values for the initial estimates xq and ho. This 
scheme of finding the strongest path can be considered as a 
generalized matched filter. 

Selection of tj or cardinality kj : The next important task is 
the selection of the regularization parameter tj in (13). At any 
jth iteration of Algorithm 1, we wish to solve (13) such that 

(i.e., number of nonzero 



the solution hj has cardinality kj 



entries) 

final solution, e.g 



We slowly increase k 



j as we proceed towards the 
for some constant r > 1. The 



value of Tj in (13), to a large extent, controls the cardinality 



of the solution hj 



For example, hj is zero if Tj 



>\\C 



j y\\oo, 

and nonzero entries start to appear in hj as we reduce Tj. 
However, it is difficult to predict the value of Tj which gives 
a kj -sparse solution of (13). Fortunately, we have a homotopy 
based algorithm to solve (13) which can provide a solution 
with the desired cardinality kj. The homotopy algorithm for 
(13) is itself an iterative scheme which starts with a zero vector 
and updates the solution and its support by adding or removing 
one element at a time [30], [31]. This way we can terminate the 
homotopy algorithm as soon as the cardinality of the homotopy 
solution becomes equal to kj. 

Normalizing Xj\ The last step of Algorithm 1 normalizes the 
least squares solution Xj to unit norm. Since the variables x 
and h are coupled together in the bilinear form (1), the magni- 
tudes of the estimates Xj and hj are inversely proportional to 
each other. At the same time, the optimization problem (13) 
minimizes £\ norm of hj . This reduces the magnitude of hj at 
every iteration and the magnitude of Xj increases at the same 
rate. The normalization of either Xj or hj brings a balance 
between their magnitudes. 

Stopping criteria: We use the value of the residual and 
cardinality of channel estimate as an indicator for conver- 
gence. Some simple examples for the stopping criteria are: 

1) Terminate Algorithm 1 after a fixed number of iterations, 

2) Terminate when the cardinality kj of channel estimate hj 
reaches some threshold, or 3) Terminate when the residual 
|| hj ® Axj — j/ 1| 2 reduces below some threshold. 
Computational cost: The main computational cost for Algo- 
rithm 1 involves solving (13) and (14) at every iteration. We 
can reduce the computational burden of solving these problems 
if we store QR factors for A at the initialization and then use 
QR factors of A and FFT operations to solve these problems. 

IV. Simulation 

In this section we present the simulation results to demon- 
strate the efficiency of the proposed random coding scheme. 
The simulation setup is as follows. We start with a signal 
x e W l whose entries are drawn from standard normal 



distribution N(0, 1). We use an m x n Gaussian matrix as the 
coding matrix A, whose entries are drawn from iV(0, 1/to). 
The received signal y is generated according to the model 
(1), where h <E K m is a fc-sparse vector which is nonzero at 
randomly chosen k locations. The nonzero entries of h are 
also drawn from N(0, 1). The signal x and channel response 
h is then estimated from y, by solving (12), using alternating 
minimization scheme in Algorithm 1. 

We tested Algorithm 1 for the exact recovery of x and h 
with different values of m, n and k. For every simulation, we 
determined if the error had converged to zero (with some small 
tolerance) within some preset number of iterations. In these 
simulations we assumed that we already know the location of 
the strongest component in the channel response h, instead 
of solving (15). The average recovery results for m = 256 
and m — 512 are presented in the form of phase diagrams 
in Figure 1. We ran 10 simulations on each 64 locations of 
interest for the m = 256 case and on 100 locations for the 
m = 512 case to produce these interpolated figures. 

It can be seen from these results that if the underlying 
channel is sufficiently sparse, then we can identify the channel 
and estimate the signal correctly by using the random coding. 

V. Discussion 

We presented a random coding scheme to protect an ar- 
bitrary signal from the effects of multipath interference. We 
jointly estimate the channel and the transmitted signal by 
formulating the recovery problem as an optimization problem. 
We presented two recovery schemes in this regard - block i\ 
and alternating minimization. 

In future, we want to develop some theoretical guarantees 
for when the recovery of x and h is possible in (1). As we 
discussed in section III that if we know either x or h, we can 
solve (12) to recover the unknown variable exactly when m s=a 
n+k. However, these theoretical results do not apply when we 
do not know x and h. We first want to determine the general 
conditions when the recovery of x and h is even possible. 
Then we want to explore the algorithms which can can reliably 
recover x and h under those conditions. AM algorithm is one 
promising option in this direction. 

Note that our model in (1) does not assume any noise in 
the received signal. A more realistic model for the received 
signal would be 

y = h® Ax + v, 

where v is a noise vector. The optimization problem (12) is 
robust to noise by its construction. Although we cannot recover 
x and h exactly from the noisy measurements here, but a good 
estimate can be made if noise level is not too large. The block 
l\ method can be modified for the noisy measurements by 
replacing the equality constraint in (8) with a data fidelity 
constraint, e.g., \\AU — y\\2 < 7 for some 7 > [21]. 

An important step in the AM algorithm is solving (13), 
at every iteration, for the desired cardinality of the solution. 
The homotopy scheme naturally lends itself to this iterative 
setup. Nonetheless, we can also use some other sparse recov- 
ery method, which can provide a solution with the desired 
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Fig. 1, Phase diagram for the performance of AM algorithm for (a) m = 256 
and (b) m = 512. The intensity of the map gives the interpolated probability 
that the Algorithm 1 gives exact recovery for the respective values of n and 
k, ranging from black (probability 0) to white (probability 1). The contour 
line at 95% success rate is shown in solid (red). 



cardinality. Two such methods are iterative thresholding [32], 
[33] and CoSaMP [34]. 
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